A State-Compensated Deep Deterministic Policy Gradient Algorithm for UAV Trajectory Tracking

Abstract

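The unmanned aerial vehicle (UAV) trajectory tracking control algorithm based on deep reinforcement learning is generally inefficient to train in an unknown environment, and its convergence is unstable. To address this, a Markov decision process (MDP) model of UAV trajectory tracking is established, and a state-compensated deep deterministic policy gradient (CDDPG) algorithm is proposed. An additional neural network (C-Net), whose input is the compensation state and whose output is the compensation action, is added to the network model of the deep deterministic policy gradient (DDPG) algorithm to assist exploration during training. The action output of the DDPG network is combined with the compensated output of the C-Net to form the action that interacts with the environment, enabling the UAV to track dynamic targets rapidly and in the most accurate, continuous, and smooth way possible. In addition, random noise is added on the basis of the generated behavior to realize exploration within a certain range and make the action value estimation more accurate. The OpenAI Gym tool is used to verify the proposed method, and the simulation results show that: (1) the proposed method can significantly improve training efficiency by adding the compensation network, and effectively improves accuracy and convergence stability; (2) under the same computer configuration, the computational cost of the proposed algorithm is basically the same as that of the QAC algorithm (actor-critic algorithm based on behavioral value Q); (3) during the training process, at the same tracking accuracy, the learning efficiency is about 70% higher than that of DDPG; (4) in the simulation tracking experiment, under the same training time, the tracking error after stabilization is about 50% lower than that of DDPG.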
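The action composition described in the abstract (actor output plus C-Net compensation plus exploration noise) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the paper's implementation: `linear_policy` is a hypothetical stand-in for a trained network, the compensation state is assumed to be a tracking-error vector, the noise is assumed Gaussian, and the clipping bound `a_max` is invented.

```python
import random


def linear_policy(weights, state):
    """Hypothetical stand-in for a trained network: a single linear layer."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def compensated_action(actor_w, cnet_w, state, comp_state,
                       noise_std=0.1, a_max=1.0, rng=random):
    """Combine actor output, compensation output, and exploration noise.

    Assumptions (not taken from the paper): both networks are linear,
    the noise is zero-mean Gaussian, and actions are clipped to
    [-a_max, a_max].
    """
    base = linear_policy(actor_w, state)       # DDPG actor action
    comp = linear_policy(cnet_w, comp_state)   # C-Net compensation action
    action = []
    for b, c in zip(base, comp):
        noisy = b + c + rng.gauss(0.0, noise_std)        # add exploration noise
        action.append(max(-a_max, min(a_max, noisy)))    # keep action in bounds
    return action


if __name__ == "__main__":
    actor_w = [[0.5, 0.0], [0.0, 0.5]]   # illustrative weights only
    cnet_w = [[0.2, 0.0], [0.0, 0.2]]
    state = [0.4, -0.2]                  # e.g. UAV state features (assumed)
    comp_state = [1.0, -1.0]             # e.g. tracking error (assumed)
    print(compensated_action(actor_w, cnet_w, state, comp_state))
```

Setting `noise_std=0.0` makes the output deterministic, which is convenient for evaluation runs where exploration is not wanted.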


Similar resources

Deep Deterministic Policy Gradient for Urban Traffic Light Control

Traffic light timing optimization is still an active line of research despite the wealth of scientific literature on the topic, and the problem remains unsolved for any non-toy scenario. One of the key issues with traffic light optimization is the large scale of the input information that is available for the controlling agent, namely all the traffic data that is continually sampled by the traf...


Deterministic Policy Gradient Algorithms

In this paper we consider deterministic policy gradient algorithms for reinforcement learning with continuous actions. The deterministic policy gradient has a particularly appealing form: it is the expected gradient of the action-value function. This simple form means that the deterministic policy gradient can be estimated much more efficiently than the usual stochastic policy gradient. To ensu...


Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning

Deep reinforcement learning for multi-agent cooperation and competition has been a hot topic recently. This paper focuses on the cooperative multi-agent problem based on actor-critic methods under local-observation settings. Multi-agent deep deterministic policy gradient obtained state-of-the-art results for some multi-agent games, whereas it cannot scale well with a growing number of agents. In order ...


Trajectory Tracking of a Mobile Robot Using Fuzzy Logic Tuned by Genetic Algorithm (TECHNICAL NOTE)

In recent years, soft computing methods, like fuzzy logic and neural networks, have been presented and developed for the purpose of mobile robot trajectory tracking. In this paper we will present a fuzzy approach to the problem of mobile robot path tracking for the CEDRA rescue robot with a complicated kinematical model. After designing the fuzzy tracking controller, the membership functions an...


Quadrotor UAV Guidence For Ground Moving Target Tracking

Studies in aerial vehicle modeling and control have increased rapidly in recent years. In this paper, the coordination of two types of heterogeneous robots, namely an unmanned aerial vehicle (UAV) and unmanned ground vehicles (UGVs), is considered. The UAV plays the role of a virtual leader for the UGVs. The system consists of a vision-based target detection algorithm that uses the ...



Journal

Journal title: Machines

Year: 2022

ISSN: 2075-1702

DOI: https://doi.org/10.3390/machines10070496